Technical Note TN2118
Kernel Core Dumps

This technote explains how you can enable remote kernel core dumps on Mac OS X 10.3 and later. Kernel core dumps allow you to examine the state of the kernel after a kernel panic or hang. You can use this to debug a kernel problem when circumstances prevent you from using the two-machine kernel debugger.

This technote is targeted at two distinct audiences. Kernel extension developers can use this information to debug kernel panics or hangs that they can't reproduce locally. In addition, system administrators can use this information to record details about which machines are panicking and why, and to perform offline debugging on high-availability systems.





Introduction

The Mac OS X kernel should never panic because, when it does, it seriously inconveniences the user. Thus, a kernel panic is always the result of a bug, either in Apple's code or in the code of some third party kernel extension. Such bugs need to be investigated and resolved.

There are a number of circumstances in which the ability to capture a kernel core dump is useful.

  • When you're writing a kernel extension and you encounter a kernel panic, you can usually debug the problem with the two-machine kernel debugger. However, there are circumstances where this isn't possible. For example, if a tester or end user reports a problem that you can't reproduce on your desktop (either because it happens infrequently or because it only happens with obscure hardware or in a non-standard environment), you will not be able to debug using the standard tools. In these circumstances it's helpful if you can capture a core dump of the panicked kernel and debug using that core dump.

  • If you manage a large group of Macintosh computers, you might want to monitor which computers are panicking and why. You can use this information to determine how frequently kernel panics occur, whether there are any common symptoms, and, most importantly, whether any third-party kernel extensions are involved.

  • Finally, if you manage a high-availability Macintosh server and you have problems with the server panicking, you can capture a kernel core dump, immediately restart the server, and then debug the problem offline.

To assist you in these situations, Apple has introduced a remote kernel core dump facility in Mac OS X 10.3. You can configure a Mac OS X computer so that, when the machine panics, it transmits a core dump of the kernel to a server via TCP/IP. The core dump server is a daemon that collects the kernel core dump from the client and writes it to disk. You can then analyze the core dump using a variety of tools, most notably GDB.

IMPORTANT: The kernel core dump client is built in to the kernel on Mac OS X 10.3 and later. In addition, the kernel core dump server, kdumpd, is installed by default on Mac OS X 10.3 and later.

The source for the kernel core dump server is part of Darwin (in the network_cmds project). It should be feasible to port this server to other UNIX-like platforms, including earlier versions of Mac OS X, although such endeavours are not covered by this technote.

This technote describes how to set up a kernel core dump server and how to configure another computer to dump to that server. It also explains how to test your setup. Finally, and this is mostly for those developing kernel extensions, it explains how you can debug using the resulting kernel core dumps.


Configuring the Server

The first step in collecting kernel core dumps is to set up a kernel core dump server. To start, you must choose a machine to act as the server, taking the following points into account.

  • The kernel core dump server is not computationally expensive; it will run easily on any machine capable of running Mac OS X.

  • Kernel core dumps are large, typically running to the order of 100 MB (this varies based on the client's kernel map size, physical memory size, usage patterns, and so on). Your server should have a lot of free disk space.

  • The server must have a static IP address.

  • The server's IP address must be visible to all of the clients. You can't place the server behind a NAT or firewall unless your clients are also behind the same NAT or firewall.

  • When a client panics, it transmits the contents of kernel memory to the server in the clear. It's quite possible that this data includes sensitive information. You should configure your network (using, for example, switched hubs, a firewall, or a VPN) so that this data can't be seen by unauthorized persons.

Note: The kernel dump protocol uses UDP on port 1069. Currently this is not configurable.
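
Because each dump runs to the order of 100 MB, it can be worth checking free space on the candidate volume before you commit to it. The following is a minimal sketch, not part of kdumpd: the helper name free_mb and the ten-dump threshold are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: report free space (in MB) on the volume that holds a directory.
# free_mb is a hypothetical helper name, not an Apple tool.
free_mb() {
    # df -P forces the portable one-line-per-filesystem layout and
    # -k reports 1024-byte blocks; column 4 is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Substitute your dump directory (for example /PanicDumps) for "/".
dir=/
# Assumed rule of thumb: leave room for at least ten 100 MB dumps.
need_mb=1000
have_mb=$(free_mb "$dir")
if [ "$have_mb" -ge "$need_mb" ]; then
    echo "$dir: $have_mb MB free, room for roughly $((have_mb / 100)) dumps"
else
    echo "$dir: only $have_mb MB free, consider another volume"
fi
```

The threshold is arbitrary; scale it to the number of clients you expect to dump to this server.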

To enable the kernel core dump server, execute the following steps.

Create core dump directory

To start, you must create a directory into which the server writes core dumps. Listing 1 shows an example of how to do this for the /PanicDumps directory. You have to make the directory world writable because the server process runs as user nobody and can't modify the directory otherwise.

Listing 1: Creating the core dump directory.

server$ mkdir /PanicDumps
server$ chmod ugo+w /PanicDumps

IMPORTANT: The rest of this document assumes that you use the /PanicDumps directory to store the kernel core dumps. If you use a different directory in this step, you will have to substitute the full path to that directory in place of /PanicDumps in subsequent steps.
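
If you want to confirm that the directory really is world writable, the mode string printed by ls -ld is enough. The sketch below uses a scratch directory rather than the real /PanicDumps so that it is safe to run anywhere; the helper name is_world_writable is an assumption made for illustration.

```shell
#!/bin/sh
# Sketch: succeed only if the directory exists and is world writable.
# is_world_writable is a hypothetical helper name.
is_world_writable() {
    [ -d "$1" ] || return 1
    # ls -ld prints a mode string such as "drwxrwxrwx"; the ninth
    # character is the write bit for "other" users.
    mode=$(ls -ld "$1" | cut -c1-10)
    [ "$(printf '%s' "$mode" | cut -c9)" = "w" ]
}

# Demonstrate on a scratch directory instead of the real dump directory.
scratch=$(mktemp -d)
chmod ugo+w "$scratch"
is_world_writable "$scratch" && echo "world writable"
chmod o-w "$scratch"
is_world_writable "$scratch" || echo "not world writable"
rmdir "$scratch"
```

On the server you would call is_world_writable /PanicDumps after running the commands in Listing 1.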

Configure xinetd

The next step is to configure the extended Internet services daemon, xinetd, to run the server when a client connects. You do this by copying the text from Listing 2 into a file called macosxkdump in the /etc/xinetd.d directory. Listing 3 shows one way to do this.

Listing 2: Contents of the 'macosxkdump' configuration file

service macosxkdump
{
    disable     = no
    type        = UNLISTED
    socket_type = dgram
    protocol    = udp
    port        = 1069
    user        = nobody
    groups      = yes
    server      = /usr/libexec/kdumpd
    server_args = /PanicDumps
    wait        = yes
}

Listing 3: Creating the 'macosxkdump' file

server$ cat > /tmp/macosxkdump
service macosxkdump
{
    disable     = no
    type        = UNLISTED
    socket_type = dgram
    protocol    = udp
    port        = 1069
    user        = nobody
    groups      = yes
    server      = /usr/libexec/kdumpd
    server_args = /PanicDumps
    wait        = yes
}
^D
server$ ls -l /tmp/macosxkdump
-rw-r--r--  1 quinn  staff  278 14 Jul 15:25 /tmp/macosxkdump
server$ sudo cp /tmp/macosxkdump /etc/xinetd.d
Password: ********
server$ ls -l /etc/xinetd.d/macosxkdump
-rw-r--r--  1 root  wheel  278 14 Jul 15:26 /etc/xinetd.d/macosxkdump

IMPORTANT: To maintain system security, the configuration file (/etc/xinetd.d/macosxkdump) should have the permissions shown by the final command in Listing 3.

Signal xinetd

You must send the SIGHUP signal to the xinetd daemon for it to recognize your configuration changes. Listing 4 shows how to do this. When entering the command, make sure to use backquotes (ASCII 96), not single quotes (ASCII 39).

Listing 4: Signalling xinetd

server$ sudo kill -HUP `cat /var/run/xinetd.pid`
Password: ********

Alternatively, you can simply reboot the machine, which also causes xinetd to recognize the new configuration.

Confirm configuration

You can confirm that everything is configured correctly using one of two methods.

  • If you look in the system log you will see text similar to that shown in Listing 5 indicating that xinetd has started a new service. You can view the system log manually (/var/log/system.log) or in the Console application.

  • You can send a SIGUSR1 signal to xinetd to ask it to dump its current configuration. Listing 6 shows how to do this. The daemon appends the information to /var/run/xinetd.dump. You should see a record indicating that the macosxkdump service is active, similar to the one shown in Listing 7.

Listing 5: System log confirmation

Jul 14 15:40:57 localhost xinetd[349]: Starting reconfiguration
Jul 14 15:40:57 localhost xinetd[349]: readjusting service ssh
Jul 14 15:40:57 localhost xinetd[349]: Reconfigured: new=1 old=1 dropped=0 (services)

Listing 6: Signalling xinetd to dump its configuration

server$ sudo kill -USR1 `cat /var/run/xinetd.pid`
Password: ********

Listing 7: xinetd dump confirmation

Service = macosxkdump
    State = Active
    Service configuration: macosxkdump
        id = macosxkdump
        flags = IPv4
        type = UNLISTED
        socket_type = dgram
        Protocol (name,number) = (udp,17)
        port = 1069
        wait = yes
        user = -2
        Groups = yes
        PER_SOURCE = -1
        Bind = All addresses.
        Server = /usr/libexec/kdumpd
        Server argv = kdumpd /PanicDumps
        Only from: All sites
        No access: No blocked sites
        Logging to syslog. Facility = daemon, level = info
        Log_on_success flags = HOST PID HOST
        Log_on_failure flags = HOST
    running servers = 0
    retry servers = 0
    attempts = 0
    service fd = 6
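
Rather than reading the dump by eye, you can search it for the service record. The following is a minimal sketch that assumes the dump has the layout shown in Listing 7; the helper name service_is_active is hypothetical, and the demonstration runs against a small sample file rather than the real /var/run/xinetd.dump.

```shell
#!/bin/sh
# Sketch: succeed if an xinetd dump file reports the named service as Active.
# service_is_active is a hypothetical helper name.
service_is_active() {
    # Find the "Service = <name>" record, then print the value of the
    # first "State = ..." line that follows it.
    awk -v name="$2" '
        $1 == "Service" && $2 == "=" { found = ($3 == name); next }
        found && $1 == "State"       { print $3; exit }
    ' "$1" | grep -qx 'Active'
}

# Demonstrate against a sample in the format of the dump listing.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Service = macosxkdump
    State = Active
    Service configuration: macosxkdump
EOF
if service_is_active "$sample" macosxkdump; then
    echo "macosxkdump is active"
fi
rm -f "$sample"
```

On the server you would pass /var/run/xinetd.dump as the first argument; you may need root privileges to read that file.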

Once you have the server configured correctly, you can proceed on to the next step, which is configuring the clients.


Configuring a Client

To enable kernel core dumps on a client machine, you must modify the boot-args Open Firmware variable to include two arguments.

  • You must set the 0x0400 flag in the debug argument. Furthermore, there are a number of other useful flags for this argument which are described in detail below.

  • You must set the _panicd_ip argument to the IP address of the kernel core dump server. You must use an IPv4 address in dotted decimal notation; IPv6 addresses and DNS names are not supported.

Listing 8 shows an example of how to set the boot-args Open Firmware variable, assuming that the kernel core dump server's IP address is 10.0.40.2.

Listing 8: Setting the boot-args Open Firmware variable

client$ sudo nvram boot-args="debug=0xd44 _panicd_ip=10.0.40.2"
Password: ********

You must restart to enable this setting.

IMPORTANT: The boot-args Open Firmware variable is reset whenever you install new system software, including software updates, and whenever you change the startup disk using System Preferences.
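
Because boot-args is reset so readily, you may find yourself setting it repeatedly, and a mistyped server address fails silently. The following is a minimal sketch that validates the address before building the argument string; is_ipv4 and make_boot_args are hypothetical helper names, and the script only prints the string rather than calling nvram.

```shell
#!/bin/sh
# Sketch: validate a dotted-decimal IPv4 address and build the boot-args
# string. is_ipv4 and make_boot_args are hypothetical helper names.
is_ipv4() {
    # Exactly four dot-separated decimal fields, each in the range 0-255.
    printf '%s\n' "$1" | awk -F. '
        NF != 4 { exit 1 }
        { for (i = 1; i <= 4; i++)
              if ($i !~ /^[0-9]+$/ || $i > 255) exit 1 }'
}

make_boot_args() {
    is_ipv4 "$2" || { echo "not an IPv4 address: $2" >&2; return 1; }
    printf 'debug=%s _panicd_ip=%s\n' "$1" "$2"
}

make_boot_args 0xd44 10.0.40.2
# The result would then be passed to nvram on the client, for example:
#   sudo nvram boot-args="$(make_boot_args 0xd44 10.0.40.2)"
```

Rejecting host names up front matches the restriction that only dotted decimal IPv4 addresses are accepted.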

Debug flags in depth

You can use the boot-args debug flags to control various aspects of kernel debugging. Many of these flags are documented in existing documentation. Table 1 describes the new flags introduced with Mac OS X 10.3, along with three old flags that are handy when generating kernel core dumps. Table 2 shows how to combine these flags to effect various useful behaviors.

Table 1: Debug flags

Symbolic Name          Flag    New  Description
DB_NMI                 0x0004  No   Activates the kernel debugging facility, including support for NMI (the programmer's switch on the front of the computer; see Technical Q&A QA1264, 'Generating an NMI Without a Programmer's Switch' if your machine does not have a programmer's switch).
DB_ARP                 0x0040  No   Allows the kernel debugger nub to use ARP and thus support debugging across subnets. You would typically enable this flag when collecting kernel core dumps.
DB_LOG_PI_SCRN         0x0100  No   Disable the graphical panic screen. You typically want to do this when you enable kernel core dumps so that you can see progress for the kernel core dump transmission.
DB_KERN_DUMP_ON_PANIC  0x0400  Yes  Causes the kernel to core dump when the system panics.
DB_KERN_DUMP_ON_NMI    0x0800  Yes  Causes the kernel to core dump when the user triggers an NMI.
DB_DBG_POST_CORE       0x1000  Yes  Controls the kernel's behavior after dumping core in response to an NMI (DB_KERN_DUMP_ON_NMI). If the user triggers an NMI and this flag is clear, the kernel will dump core and then continue. Conversely, if this flag is set the kernel will dump core and then wait for a debugger connection.
DB_PANICLOG_DUMP       0x2000  Yes  Controls whether the kernel dumps a full core (if the flag is clear) or simply a panic log (if the flag is set).
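
When you inherit a machine with an unfamiliar debug value, it helps to decode it back into flag names. The following is a minimal sketch that covers only the flags listed in Table 1; decode_debug is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list the Table 1 flags that are set in a debug value.
# decode_debug is a hypothetical helper name.
decode_debug() {
    value=$(($1))
    for entry in \
        DB_NMI:0x0004 DB_ARP:0x0040 DB_LOG_PI_SCRN:0x0100 \
        DB_KERN_DUMP_ON_PANIC:0x0400 DB_KERN_DUMP_ON_NMI:0x0800 \
        DB_DBG_POST_CORE:0x1000 DB_PANICLOG_DUMP:0x2000
    do
        name=${entry%%:*}
        bit=${entry##*:}
        # A flag is set when ANDing the value with its bit is non-zero.
        if [ $((value & $bit)) -ne 0 ]; then
            echo "$name"
        fi
    done
}

decode_debug 0xd44
```

Bits that do not appear in Table 1 are silently ignored, so an empty result does not prove that the value is zero.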

Table 2: Useful debug flag combinations

Value   Scenario
0x0044  Used for day-to-day two-machine debugging. DB_NMI allows you to enter the kernel debugger by triggering NMI. DB_ARP lets you debug without futzing around with permanent ARP table entries.
0x0444  Used for capturing kernel core dumps. DB_NMI is on in order to activate kernel debugging. DB_ARP is on, as explained above. DB_KERN_DUMP_ON_PANIC activates the kernel core dump facility.
0x2444  Used for capturing kernel panic logs. The flags are set per the previous row except that DB_PANICLOG_DUMP is set, causing the kernel to generate panic logs rather than core dumps.
0x0844  Useful when the user reports mysterious kernel-level freezes. When a freeze occurs, the user can trigger NMI and the system generates a kernel core dump and then continues. There's no need to set DB_LOG_PI_SCRN because it has no effect in this case.
0x0d44  Useful for testing the kernel core dump facility. This generates a kernel core dump if either a panic occurs or the user triggers NMI. DB_LOG_PI_SCRN is set so that you can see progress for the kernel core dump transmission.
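
Each value in Table 2 is simply the bitwise OR of the flags it names. The following is a minimal sketch that rebuilds the five values with shell arithmetic; combine is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: rebuild the Table 2 values by ORing individual flags together.
DB_NMI=0x0004
DB_ARP=0x0040
DB_LOG_PI_SCRN=0x0100
DB_KERN_DUMP_ON_PANIC=0x0400
DB_KERN_DUMP_ON_NMI=0x0800
DB_PANICLOG_DUMP=0x2000

# combine is a hypothetical helper: OR all arguments and print the
# result as a four-digit hexadecimal value.
combine() {
    result=0
    for flag in "$@"; do
        result=$((result | $flag))
    done
    printf '0x%04x\n' "$result"
}

# Two-machine debugging
combine $DB_NMI $DB_ARP
# Core dump on panic
combine $DB_NMI $DB_ARP $DB_KERN_DUMP_ON_PANIC
# Panic log instead of a full core
combine $DB_NMI $DB_ARP $DB_KERN_DUMP_ON_PANIC $DB_PANICLOG_DUMP
# Core dump on NMI, for diagnosing freezes
combine $DB_NMI $DB_ARP $DB_KERN_DUMP_ON_NMI
# Testing: dump on panic or NMI, with text progress on screen
combine $DB_NMI $DB_ARP $DB_LOG_PI_SCRN $DB_KERN_DUMP_ON_PANIC $DB_KERN_DUMP_ON_NMI
```

The five lines printed should match the Value column of Table 2, row by row.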


Testing Your Configuration

Once you have enabled the server and configured the client, it's time to test things to make sure they're working as expected. To test the system you need to trigger a kernel panic. To make this easy, this technote includes a simple kernel extension, InstantPanic, that panics the kernel as soon as you load it. You can get the source and binary for this kernel extension from the Downloadables section.

You should start by downloading and unarchiving the kernel extension on your desktop. You can then load the kernel extension using the commands shown in Listing 9.

Listing 9: Loading the 'InstantPanic' kernel extension

client$ cd ~/Desktop/InstantPanic/build/
client$ sudo cp -r InstantPanic.kext /
Password: ********
client$ sudo kextload /InstantPanic.kext

Note: In Listing 9 we make a copy of the kernel extension as root (using sudo) in order to guarantee that it has the right file permissions. You cannot load a kernel extension with the wrong file permissions. If kextload prints a message saying that the KEXT is not authentic, you can fix the permissions using sudo chown -R root:wheel /InstantPanic.kext.

Executing the commands in Listing 9 will cause the client machine to panic. Then, if all goes well, the client will transmit a kernel core dump to the server. You can see transmission progress on the screen of the client (assuming that you set the DB_LOG_PI_SCRN debug flag). Once the transfer is complete you can list the /PanicDumps directory on the server to see the new panic dump file.

Listing 10: An example kernel core dump

server$ ls -l /PanicDumps
total 216872
-rw-rw----  1 nobody  admin  111038464 14 Jul 17:43 core-xnu-517-10.0.40.7-e58299ec

The name of this file includes the kernel version (as displayed by uname -a, in this case 517) and the client's IP address (10.0.40.7), along with a unique timestamp.

Note: The name does not include the kernel's minor version. This is a known issue that should be resolved in a future system release (r. 3735061).
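
If you process dumps in scripts, you can split the file name into its parts with shell parameter expansion. The following is a minimal sketch; the layout it assumes is inferred from the example names in this technote rather than from any documented format, and parse_dump_name is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: split a dump file name of the assumed form
#   <kind>-xnu-<version>-<client IP>-<suffix>
# into its parts. parse_dump_name is a hypothetical helper name.
parse_dump_name() {
    name=${1##*/}              # drop any leading directory
    kind=${name%%-*}           # "core" or "paniclog"
    rest=${name#*-xnu-}        # "<version>-<client IP>-<suffix>"
    version=${rest%%-*}
    rest=${rest#*-}            # "<client IP>-<suffix>"
    client=${rest%-*}
    suffix=${rest##*-}
    echo "kind=$kind version=$version client=$client suffix=$suffix"
}

parse_dump_name /PanicDumps/core-xnu-517-10.0.40.7-e58299ec
```

The function only inspects the string it is given, so it works on names copied from a listing as well as on real files.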

The next section explains how you can debug a kernel panic using this file. Alternatively, if you set the DB_PANICLOG_DUMP flag, the /PanicDumps directory will contain a panic log file, much like the file you'd get in /Library/Logs after a panic. Listing 11 shows an example of this.

Listing 11: An example panic log file

server$ ls -l /PanicDumps
total 216872
-rw-rw----  1 nobody  staff        738 14 Jul 10:53 paniclog-xnu-517-10.0.40.7-74da81e2
server$ cat /PanicDumps/paniclog-xnu-517-10.0.40.7-74da81e2
panic(cpu 0): InstantPanic: Just add water!
Latest stack backtrace for cpu 0:
      Backtrace:
         0x000833B8 0x0008389C 0x0001ED8C 0x144ED038 0x144ED0E0 \
0x0007FE28 0x00080094 0x0003CF6C
         0x00021650 0x0001BCD0 0x0001C0D8 0x00093D58 0x006D0069
      Kernel loadable modules in backtrace (with dependencies):
         com.apple.dts.kext.InstantPanic(1.0)@0x144ec000
Proceeding back via exception chain:
   Exception state (sv=0x1C8C3280)
      PC=0x900075C8; MSR=0x0200F030; DAR=0x144ED19C; DSISR=0x40000000; \
LR=0x90007118; R1=0xBFFFF410; XCP=0x00000030 (0xC00 - System call)

Kernel version:
Darwin Kernel Version 7.0.0:
Wed Sep 24 15:48:39 PDT 2003; root:xnu/xnu-517.obj~1/RELEASE_PPC

For more information about kernel panic logs, see Technical Note TN2063, 'Understanding and Debugging Kernel Panics'.
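
If you are collecting dumps from many machines, a per-client count gives a quick picture of which computers are panicking most often. The following is a minimal sketch that assumes the file name layout shown in the listings above; dumps_per_client is a hypothetical helper name, and the demonstration uses empty files in a scratch directory rather than real dumps.

```shell
#!/bin/sh
# Sketch: count dump files per client IP in a dump directory.
# dumps_per_client is a hypothetical helper name.
dumps_per_client() {
    for path in "$1"/core-xnu-* "$1"/paniclog-xnu-*; do
        [ -e "$path" ] || continue      # skip patterns that matched nothing
        name=${path##*/}
        rest=${name#*-xnu-*-}           # "<client IP>-<suffix>"
        echo "${rest%-*}"               # keep only the client IP
    done | sort | uniq -c
}

# Demonstrate with empty files named like the examples in this technote.
scratch=$(mktemp -d)
: > "$scratch/core-xnu-517-10.0.40.7-e58299ec"
: > "$scratch/paniclog-xnu-517-10.0.40.7-74da81e2"
dumps_per_client "$scratch"
rm -rf "$scratch"
```

On the server you would run dumps_per_client /PanicDumps; each output line is a count followed by a client address.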


Debugging with Kernel Core Dumps

If you're a kernel programmer, you can use kernel core dumps to debug kernel panics and hangs. Before you start, you should collect together the following helpful resources.

  • The Kernel Debug Kit for the kernel that panicked. You can get Kernel Debug Kits from the Apple developer web site.

    The rest of this section assumes that you have mounted the correct Kernel Debug Kit disk image on your machine.

  • The Darwin source code for the kernel that panicked. You can get Darwin source code from the Darwin page on the Apple developer site. The source for the kernel is held in the xnu project.

    The Darwin kernel source is virtually identical to the source used to build the Mac OS X kernel. In almost all cases you can use the Darwin source to do meaningful source-level debugging of the Mac OS X kernel.

    The rest of this section assumes that the machine that panicked is running Mac OS X 10.3, which corresponds to xnu-517. Furthermore, it assumes that you have downloaded the xnu-517 source and that you unarchived it to a folder on your desktop called xnu-517.

IMPORTANT: You should download the resources that correspond to the kernel version on the client machine (the machine that panicked).

The first step to debugging with a kernel core file is to open that file in GDB using the -c option. Listing 12 shows an example of this. Once you've opened the core file in GDB, you can inspect the state of the kernel using standard GDB commands. This example uses bt to display a backtrace of the stack.

Listing 12: Opening a kernel core file in GDB

server$ gdb -c /PanicDumps/core-xnu-517-10.0.40.7-e58299ec
GNU gdb 5.3-20030128 (Apple version gdb-309) (Thu Dec  4 15:41:30 GMT 2003)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "powerpc-apple-darwin".
unable to read unknown load command 0x0
#0  0x00083a3c in ?? ()
(gdb) bt
#0  0x00083a3c in ?? ()
#1  0x00083a3c in ?? ()
#2  0x14406038 in ?? ()
#3  0x144060e0 in ?? ()
#4  0x0007fe28 in ?? ()
#5  0x00080094 in ?? ()
#6  0x0003cf6c in ?? ()
#7  0x00021650 in ?? ()
#8  0x0001bcd0 in ?? ()
#9  0x0001c0d8 in ?? ()
#10 0x00093d58 in ?? ()
#11 0x003e0045 in ?? ()

Note: In Xcode 1.2 and earlier, GDB does not allow you to have spaces in the pathname that you supply to the -c argument (r. 3727326).

As you can see, this backtrace contains no symbolic information. You can fix this by loading kernel symbols from the Kernel Debug Kit. Listing 13 shows an example of this. Once you load the kernel symbols, the backtrace becomes much more helpful.

Listing 13: Loading kernel symbols

(gdb) add-symbol-file /Volumes/KernelDebugKit/mach_kernel
add symbol table from file "/Volumes/KernelDebugKit/mach_kernel"? (y or n) y
Reading symbols from /Volumes/KernelDebugKit/mach_kernel...done.
(gdb) bt
#0  Debugger (message=0x2952a0 "panic") at /SourceCache/xnu/xnu-517/osfmk/ppc/model_de…
#1  0x0001ed8c in panic (str=0x2952a0 "panic") at /SourceCache/xnu/xnu-517/osfmk/kern/…
#2  0x0001ed8c in panic (str=0x1440616c "InstantPanic: Just add water!") at /SourceCac…
#3  0x14406038 in ?? ()
#4  0x144060e0 in ?? ()
#5  0x0007fe28 in kmod_start_or_stop (id=339763600, start=339763600, data=0x105dd2c, d…
#6  0x00080094 in kmod_control (host_priv=0xd9b000, id=1143226596, flavor=1, data=0x10…
#7  0x0003cf6c in _Xkmod_control (InHeadP=0x105dd10, OutHeadP=0x10bb110) at mach/host_…
#8  0x00021650 in ipc_kobject_server (request=0x2e0000) at /SourceCache/xnu/xnu-517/os…
#9  0x0001bcd0 in mach_msg_overwrite_trap (msg=0xbffff450, option=17161488, send_size=…
#10 0x0001c0d8 in mach_msg_trap (msg=0xd9b000, option=165072, send_size=1, rcv_size=0,…
#11 0x00093d58 in .L_kernel_syscall ()
#12 0x003e0045 in zombproc ()

Now that you have symbols, you're only a short step away from source-level debugging. As you can see from frame 1 of the backtrace, GDB expects to find the kernel source in the directory /SourceCache/xnu/xnu-517. You can meet that expectation by virtue of a well-placed symbolic link. Listing 14 shows how to create this link.

Listing 14: Creating a symlink so that GDB can find the source

server$ mkdir -p /SourceCache/xnu
server$ ln -s ~/Desktop/xnu-517 /SourceCache/xnu

Now you can look back up the stack (using the frame command) and get actual source code listings (using the list command). Listing 15 shows an example of this.

Listing 15: Source-level debugging

(gdb) frame 1
#1  0x0001ed8c in panic (str=0x2952a0 "panic") at /SourceCache/xnu/xnu-517/osfmk/kern/…
198              * Release panicwait indicator so that other cpus may call Debugger().
(gdb) list
193             _doprnt(str, &listp, consdebug_putc, 0);
194             va_end(listp);
195             kdb_printf("\n");
196
197             /*
198              * Release panicwait indicator so that other cpus may call Debugger().
199              */
200             panicwait = 0;
201             Debugger("panic");
202             /*

Last, but certainly not least, you can use the kernel debugging macros on a kernel core dump in the same way you would on a live kernel. Listing 16 shows how to load up the kernel debugging macros and execute two of the most useful macros.

  • paniclog prints the standard panic information

  • showallstacks prints a backtrace for every thread running within the kernel

Listing 16: Kernel debugging macros in action

(gdb) source /Volumes/KernelDebugKit/kgmacros
Loading Kernel GDB Macros package.  Type "help kgm" for more info.
(gdb) paniclog
panic(cpu 0): InstantPanic: Just add water!
Latest stack backtrace for cpu 0:
      Backtrace:
         0x000833B8 0x0008389C 0x0001ED8C 0x14421038 0x144210E0 \
0x0007FE28 0x00080094 0x0003CF6C
         0x00021650 0x0001BCD0 0x0001C0D8 0x00093D58 0x00700070
      Kernel loadable modules in backtrace (with dependencies):
         com.apple.dts.kext.InstantPanic(1.0)@0x14420000
Proceeding back via exception chain:
   Exception state (sv=0x1C926500)
      PC=0x900075C8; MSR=0x0200F030; DAR=0x1442119C; DSISR=0x40000000; \
LR=0x90007118; R1=0xBFFFF410; XCP=0x00000030 (0xC00 - System call)

Kernel version:
Darwin Kernel Version 7.0.0:
Wed Sep 24 15:48:39 PDT 2003; root:xnu/xnu-517.obj~1/RELEASE_PPC


(gdb) showallstacks
[...]
task        vm_map      ipc_space  #acts   pid  proc        command
0x00c94980  0x009438dc  0x00c866b0    1    384  0x00e28a08  kextload
            activation  thread      pri  state  wait_queue  wait_event
            0x00d9b000  0x00d9b000   26  R
                kernel_stack=0x076d8000
                stacktop=0x076dbb30
                0x076dbb30  0x83a3c <Debugger+524>
                0x076dbbb0  0x1ed8c <panic+472>
                0x076dbc30  0x14406038
                0x076dbc80  0x144060e0
                0x076dbcd0  0x7fe28 <kmod_start_or_stop+224>
                0x076dbd40  0x80094 <kmod_control+128>
                0x076dbda0  0x3cf6c <_Xkmod_control+256>
                0x076dbe00  0x21650 <ipc_kobject_server+284>
                0x076dbe50  0x1bcd0 <mach_msg_overwrite_trap+2992>
                0x076dbf20  0x1c0d8 <mach_msg_trap+28>
                0x076dbf70  0x93d58 <.L_mach_return>
                0x076dbfc0  0x3e0045 <com.apple.driver.AppleI2C + 0x4045>
                stackbottom=0x076dbfc0

You can learn more about the kernel debugging macros in the I/O Kit documentation.

Note: The switchtoact macro does not currently work on kernel core files (r. 3401283).
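
If you analyze core dumps regularly, you may want to collect the GDB commands from this section into a command file so that every dump gets the same first-pass treatment. The following is a minimal sketch that only writes the file; write_gdb_commands is a hypothetical helper name, the Kernel Debug Kit path is the one assumed throughout this section, and the suggested GDB invocation in the comment is an assumption that has not been checked against every GDB release.

```shell
#!/bin/sh
# Sketch: write a GDB command file that repeats the steps in this section.
# write_gdb_commands is a hypothetical helper name. The second argument
# is the Kernel Debug Kit mount point (defaults to the path used above).
write_gdb_commands() {
    kit=${2:-/Volumes/KernelDebugKit}
    cat > "$1" <<EOF
add-symbol-file $kit/mach_kernel
source $kit/kgmacros
bt
paniclog
showallstacks
EOF
}

cmds=$(mktemp)
write_gdb_commands "$cmds"
cat "$cmds"
# Assumed invocation, using GDB's -batch, -x and -c options (not run here):
#   gdb -batch -x "$cmds" -c /PanicDumps/core-xnu-517-10.0.40.7-e58299ec
rm -f "$cmds"
```

When GDB runs non-interactively it normally answers its own confirmation prompts, but you should verify that add-symbol-file behaves this way with the GDB version you have installed.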


Conclusion

The kernel core dump facility is a useful debugging tool for both kernel extension developers and users with large or complex Macintosh installations. Using this facility, you can capture information about kernel panics (and kernel hangs) where it's not possible to use the two-machine kernel debugger.


Further Reading


Downloadables


Document Revision History

Date        Notes
2004-10-27  Use mkdir -p to create the SourceCache path in one command. Mention the paniclog kernel debugging macro. Put the working copy of macosxkdump in /tmp. Note that changing startup disk also resets boot-args. Reference Q&A 1264 in the table entry describing DB_NMI.
2004-08-19  First Version

Posted: 2004-10-27